Attention-based Memory Selection Recurrent Network for Language Modeling
Authors
Abstract
Recurrent neural networks (RNNs) have achieved great success in language modeling. However, because an RNN has a fixed-size memory, it cannot store all the information about the words seen earlier in the sentence, so useful long-term information may be ignored when predicting the next word. In this paper, we propose the Attention-based Memory Selection Recurrent Network (AMSRN), in which the model can review the information stored in the memory at each previous time step and select the relevant information to help generate the output. In AMSRN, the attention mechanism finds the time steps whose memory stores relevant information, and memory selection determines which dimensions of the memory are involved in computing the attention weights and from which dimensions the information is extracted. In the experiments, AMSRN outperformed long short-term memory (LSTM) based language models on both English and Chinese corpora. Moreover, we investigate using entropy as a regularizer for the attention weights and visualize how the attention mechanism helps language modeling.
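As a rough illustration of the mechanism described in the abstract, the following minimal NumPy sketch gates the memory dimensions, attends over the states stored at previous time steps, and computes the entropy of the attention weights for use as a regularizer. It is only a sketch under assumed forms: the parameter names (W_sel, W_att), the sigmoid gate, and the bilinear scoring are illustrative choices, not the authors' exact formulation.

```python
# Minimal NumPy sketch of attention with memory selection over past states.
# NOTE: an illustrative assumption of the mechanism, not the paper's exact
# equations; W_sel, W_att and the sigmoid/bilinear forms are made up here.
import numpy as np

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def attend_with_selection(h_t, memories, W_sel, W_att):
    """h_t: (d,) current state; memories: (T, d) states from previous steps."""
    # Memory selection: a gate over memory dimensions (assumed sigmoid form).
    gate = 1.0 / (1.0 + np.exp(-(W_sel @ h_t)))          # (d,)
    selected = memories * gate                            # (T, d)

    # Attention over previous time steps, scored against the current state.
    scores = selected @ (W_att @ h_t)                     # (T,)
    alpha = softmax(scores)                               # attention weights

    # Extract the relevant information as a weighted sum of selected memories.
    context = alpha @ selected                            # (d,)

    # Entropy of the attention distribution, usable as a regularization term.
    entropy = -np.sum(alpha * np.log(alpha + 1e-12))
    return context, alpha, entropy

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, T = 8, 5
    ctx, alpha, ent = attend_with_selection(
        rng.normal(size=d), rng.normal(size=(T, d)),
        0.1 * rng.normal(size=(d, d)), 0.1 * rng.normal(size=(d, d)))
    print("weights:", np.round(alpha, 3), "entropy:", round(float(ent), 3))
```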
Similar resources
A Sentence Interaction Network for Modeling Dependence between Sentences
Modeling interactions between two sentences is crucial for a number of natural language processing tasks, including Answer Selection, Dialogue Act Analysis, etc. While deep learning methods such as Recurrent Neural Networks or Convolutional Neural Networks have proved powerful for sentence modeling, prior studies paid less attention to interactions between sentences. In this work, we propo...
Inner Attention based Recurrent Neural Networks for Answer Selection
Attention-based recurrent neural networks have shown advantages in representing natural language sentences (Hermann et al., 2015; Rocktäschel et al., 2015; Tan et al., 2015). Based on recurrent neural networks (RNNs), external attention information was added to the hidden representations to obtain an attentive sentence representation. Despite the improvement over non-attentive models, the attention mech...
Feedforward Sequential Memory Neural Networks without Recurrent Feedback
We introduce a new structure for memory neural networks, called feedforward sequential memory networks (FSMN), which can learn long-term dependency without using recurrent feedback. The proposed FSMN is a standard feedforward neural network equipped with learnable sequential memory blocks in the hidden layers (a rough sketch of this idea appears after this list). In this work, we have applied FSMN to several language modeling (LM) tasks. Experime...
Autoregressive Attention for Parallel Sequence Modeling
We introduce an autoregressive attention mechanism for parallelizable character-level sequence modeling. We use this method to augment a neural model consisting of blocks of causal convolutional layers connected by highway network skip connections. We denote the models without and with the proposed attention mechanism respectively as Highway Causal Convolution (Causal Conv) and Autoregressive-at...
Detecting Hazardous Events from Sequential Data with Multilayer Architectures
Multivariate time series data play an important role in many domains, including real-time monitoring systems. In this paper, we focus on multilayer neural architectures that are capable of learning high level representations from raw data. This includes our previous solution based on Recurrent Neural Networks with Long Short-Term Memory (LSTM) cells. We build upon this work and present improved...
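As referenced in the FSMN entry above, the following is a rough sketch of an FSMN-style memory block under a scalar-coefficient reading of the idea: each hidden activation is augmented with a learned weighted sum of the previous N activations, with no recurrent feedback. The function name, coefficient shape, and nonlinearities are assumptions, not the cited paper's exact design.

```python
# Rough sketch of an FSMN-style memory block (scalar-coefficient reading):
# each hidden activation is augmented with a learned weighted sum of the
# previous N activations; no recurrent feedback is used. Names, shapes and
# nonlinearities here are assumptions, not the cited paper's exact design.
import numpy as np

def fsmn_layer(x, W_in, a, W_out):
    """x: (T, d_in) inputs; a: (N,) learnable look-back coefficients."""
    h = np.tanh(x @ W_in)                       # (T, d_h) feedforward hidden
    h_tilde = h.copy()
    N = a.shape[0]
    for t in range(h.shape[0]):
        for n in range(1, N + 1):               # weighted sum of past steps
            if t - n >= 0:
                h_tilde[t] += a[n - 1] * h[t - n]
    return np.tanh(h_tilde @ W_out)             # pass augmented activations on

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(size=(6, 4))
    y = fsmn_layer(x, 0.1 * rng.normal(size=(4, 8)),
                   0.1 * rng.normal(size=3), 0.1 * rng.normal(size=(8, 2)))
    print(y.shape)  # (6, 2)
```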
Journal title: CoRR
Volume: abs/1611.08656
Issue: -
Pages: -
Publication date: 2016